PATENT 

ATTORNEY DOCKET NO: 06666-107001 / USC-3022 



What is claimed is: 



1 1. A machine translation decoding method comprising: 

2 receiving as input a text segment in a source language to be 

3 translated into a target language; 

4 generating an initial translation as a current target 

5 language translation; 

6 applying one or more modification operators to the current

7 target language translation to generate one or more modified

8 target language translations;

9 determining whether one or more of the modified target

10 language translations represents an improved translation in

11 comparison with the current target language translation;

12 setting a modified target language translation as the

13 current target language translation; and

14 repeating said applying, said determining and said setting

15 until occurrence of a termination condition. 
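
As a rough illustration of the loop recited in claim 1, the following Python sketch hill-climbs from an initial translation until no candidate scores higher. The names `greedy_decode`, `substitute_one_word`, and `toy_score`, the two-entry lexicon, and the word-counting score are hypothetical stand-ins chosen for the example; the claim does not prescribe any of them.

```python
from typing import Callable, Iterable, List

Translation = List[str]
Operator = Callable[[Translation], Iterable[Translation]]


def greedy_decode(
    initial: Translation,
    operators: Iterable[Operator],
    score: Callable[[Translation], float],
    max_iterations: int = 100,
) -> Translation:
    """Hill-climb from an initial translation until no candidate scores higher."""
    operators = list(operators)
    current = initial
    current_score = score(current)
    for _ in range(max_iterations):
        # Apply every modification operator to the current translation.
        candidates = [c for op in operators for c in op(current)]
        if not candidates:
            break
        best = max(candidates, key=score)
        best_score = score(best)
        # Stop when the best modified translation is no improvement.
        if best_score <= current_score:
            break
        # Otherwise the modified translation becomes the current one.
        current, current_score = best, best_score
    return current


def substitute_one_word(translation: Translation) -> Iterable[Translation]:
    """Hypothetical operator: change the translation of a single word."""
    lexicon = {"maison": "house", "bleue": "blue"}  # assumed toy lexicon
    for i, word in enumerate(translation):
        if word in lexicon:
            yield translation[:i] + [lexicon[word]] + translation[i + 1:]


def toy_score(translation: Translation) -> float:
    """Hypothetical score: count of words found in a tiny target vocabulary."""
    return sum(word in {"the", "house", "blue"} for word in translation)


result = greedy_decode(["the", "maison", "bleue"], [substitute_one_word], toy_score)
```

In this toy run the initial translation is a partly untranslated gloss, and each iteration replaces one more word until no operator yields a higher score.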

1 2. The method of claim 1 wherein the text segment

2 comprises a clause, a sentence, a paragraph or a treatise. 

1 3. The method of claim 1 wherein generating an initial 

2 translation comprises generating a gloss.

1 4. The method of claim 3 wherein the gloss is a

2 word-for-word gloss or a phrase-for-phrase gloss.

1 5. The method of claim 1 wherein applying one or more

2 modification operators comprises changing in the current target 

3 language translation the translation of one or two words. 

1 6. The method of claim 1 wherein applying one or more 

2 modification operators comprises (i) changing in the current 

3 target language translation a translation of a word and 

4 concurrently (ii) inserting another word at a position that 

5 yields an alignment of highest probability between the source 

6 language text segment and the current target language 

7 translation, the inserted other word having a high probability of 

8 having a zero-value fertility. 

1 7. The method of claim 1 wherein applying one or more

2 modification operators comprises deleting from the current target 

3 language translation a word having a zero-value fertility. 

1 8. The method of claim 1 wherein applying one or more

2 modification operators comprises modifying an alignment between 

3 the source language text segment and the current target language 

4 translation by swapping non-overlapping target language word

5 segments in the current target language translation.

1 9. The method of claim 1 wherein applying one or more

2 modification operators comprises modifying an alignment between 

3 the source language text segment and the current target language 

4 translation by (i) eliminating a target language word from the 

5 current target language translation and (ii) linking words in the 

6 source language text segment. 

1 10. The method of claim 1 wherein applying one or more 

2 modification operators comprises applying two or more of the

3 following:

4 (i) changing in the current target language translation the

5 translation of one or two words;

6 (ii) changing in the current target language translation a

7 translation of a word and concurrently inserting another word at

8 a position that yields an alignment of highest probability

9 between the source language text segment and the current target

10 language translation, the inserted other word having a high

11 probability of having a zero-value fertility;

12 (iii) deleting from the current target language translation

13 a word having a zero-value fertility;

14 (iv) modifying an alignment between the source language text

15 segment and the current target language translation by swapping

16 non-overlapping target language word segments in the current

17 target language translation; and

18 (v) modifying an alignment between the source language text

19 segment and the current target language translation by

20 eliminating a target language word from the current target

21 language translation and linking words in the source language

22 text segment.
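
Two of the operators listed in claim 10, (iii) and (iv), can be sketched in Python as follows. The data representation is an assumption made for the example: a translation is a list of target words, and an alignment is a list giving, for each source word, the index of the target word it is linked to (or `None`). The fertility of a target word is taken here to be the number of source words linked to it. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Alignment = List[Optional[int]]  # assumed: source position -> target position


def fertilities(target_len: int, alignment: Alignment) -> List[int]:
    """Count how many source words are linked to each target word."""
    counts = [0] * target_len
    for j in alignment:
        if j is not None:
            counts[j] += 1
    return counts


def delete_zero_fertility(
    target: List[str], alignment: Alignment, position: int
) -> Tuple[List[str], Alignment]:
    """Operator (iii): delete a target word that no source word links to."""
    if fertilities(len(target), alignment)[position] != 0:
        raise ValueError("word at this position has non-zero fertility")
    new_target = target[:position] + target[position + 1:]
    # Links to words after the deleted one shift left by one position.
    new_alignment = [
        j - 1 if j is not None and j > position else j for j in alignment
    ]
    return new_target, new_alignment


def swap_segments(
    target: List[str], alignment: Alignment, i1: int, i2: int, j1: int, j2: int
) -> Tuple[List[str], Alignment]:
    """Operator (iv): swap target[i1:i2] with the non-overlapping target[j1:j2]."""
    if not (0 <= i1 < i2 <= j1 < j2 <= len(target)):
        raise ValueError("segments must be non-empty, ordered, and non-overlapping")
    order = (
        list(range(0, i1)) + list(range(j1, j2)) + list(range(i2, j1))
        + list(range(i1, i2)) + list(range(j2, len(target)))
    )
    new_target = [target[k] for k in order]
    # Re-point each link at the new position of the same target word.
    old_to_new = {old: new for new, old in enumerate(order)}
    new_alignment = [old_to_new[j] if j is not None else None for j in alignment]
    return new_target, new_alignment


target = ["yes", "heard", "he"]
alignment = [2, 1]  # source word 0 -> "he", source word 1 -> "heard"
target, alignment = delete_zero_fertility(target, alignment, 0)
target, alignment = swap_segments(target, alignment, 0, 1, 1, 2)
```

Both operators return a new translation and a correspondingly updated alignment, so a decoding loop can score each result without mutating the current translation.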

1 11. The method of claim 1 wherein determining whether one

2 or more of the modified target language translations represents

3 an improved translation in comparison with the current target

4 language translation comprises calculating a probability of

5 correctness for each of the modified target language

6 translations.



1 12. The method of claim 1 wherein the termination condition 

2 comprises a determination that a probability of correctness of a

3 modified target language translation is no greater than a

4 probability of correctness of the current target language 

5 translation. 

1 13. The method of claim 1 wherein the termination condition

2 comprises a completion of a predetermined number of iterations. 

1 14. The method of claim 1 wherein the termination condition

2 comprises a lapse of a predetermined amount of time. 
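
The three termination conditions of claims 12 through 14 could be checked together along the following lines. This is only a sketch: the class name `StopCriteria`, the method name `reason_to_stop`, the returned reason strings, and the injectable `clock` parameter are assumptions introduced for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StopCriteria:
    max_iterations: int  # assumed iteration budget (claim 13)
    max_seconds: float   # assumed time budget in seconds (claim 14)
    clock: Callable[[], float] = time.monotonic  # injectable for testing

    def reason_to_stop(
        self,
        iteration: int,
        started_at: float,
        candidate_score: float,
        current_score: float,
    ) -> Optional[str]:
        """Return the name of the first condition met, or None to keep going."""
        # Claim 12: the modified translation is no more probable than the current one.
        if candidate_score <= current_score:
            return "no_improvement"
        # Claim 13: a predetermined number of iterations has been completed.
        if iteration >= self.max_iterations:
            return "iteration_limit"
        # Claim 14: a predetermined amount of time has elapsed.
        if self.clock() - started_at >= self.max_seconds:
            return "time_limit"
        return None


# A fixed clock makes the example deterministic.
criteria = StopCriteria(max_iterations=10, max_seconds=5.0, clock=lambda: 100.0)
```

Passing the clock in as a parameter keeps the time-based condition testable without waiting for real time to elapse.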

1 15. A computer-implemented machine translation decoding

2 method comprising iteratively modifying a target language 

3 translation of a source language text segment until an occurrence 

4 of a termination condition. 

1 16. The method of claim 15 wherein the termination 

2 condition comprises a determination that a probability of 

3 correctness of a modified translation is no greater than a 

4 probability of correctness of a previous translation. 

1 17. The method of claim 15 wherein the termination 

2 condition comprises a completion of a predetermined number of 

3 iterations. 

1 18. The method of claim 15 wherein the source language text 

2 segment comprises a clause, a sentence, a paragraph, or a

3 treatise. 

1 19. The method of claim 15 wherein the method starts with 

2 an approximate target language translation and iteratively 

3 improves the translation with each successive iteration. 

1 20. The method of claim 19 wherein the approximate target 

2 language translation comprises a gloss.

1 21. The method of claim 20 wherein the gloss comprises a

2 word-for-word gloss or a phrase-for-phrase gloss.

1 22. The method of claim 19 wherein the approximate target 

2 language translation comprises a predetermined translation 

3 selected from among a plurality of predetermined translations.

1 23. The method of claim 15 wherein the method implements a 

2 greedy algorithm. 

1 24. The method of claim 15 wherein iteratively modifying 

2 the translation comprises incrementally improving the translation 

3 with each iteration. 

1 25. The method of claim 15 wherein iteratively modifying 

2 the translation comprises performing at each iteration one or 

3 more modification operations on the translation. 

1 26. The method of claim 25 wherein the one or more 

2 modification operations comprises one or more of the following 

3 operations: 

4 (i) changing one or two words in the translation; 

5 (ii) changing a translation of a word and concurrently 

6 inserting another word at a position that yields an alignment of 

7 highest probability between the source language text segment and

8 the translation, the inserted other word having a high

9 probability of having a zero-value fertility;

10 (iii) deleting from the translation a word having a

11 zero-value fertility;

12 (iv) modifying an alignment between the source language text

13 segment and the translation by swapping non-overlapping target

14 language word segments in the translation; and

15 (v) modifying an alignment between the source language text

16 segment and the translation by eliminating a target language word

17 from the translation and linking words in the source language

18 text segment.

1 27. A machine translation decoder comprising:

2 a decoding engine comprising one or more modification

3 operators to be applied to a current target language translation

4 to generate one or more modified target language translations; 

5 and 

6 a process loop to iteratively modify the current target 

7 language translation using the one or more modification 

8 operators, the process loop terminating upon occurrence of a 

9 termination condition.

1 28. The decoder of claim 27 wherein the process loop

2 controls the decoding engine to incrementally improve the current 

3 target language translation with each iteration. 

1 29. The decoder of claim 27 further comprising a module for 

2 determining a probability of correctness for a translation. 

1 30. The decoder of claim 29 wherein the module for 

2 determining a probability of correctness for a translation 

3 comprises a language model and a translation module. 

1 31. The decoder of claim 29 wherein the process loop 

2 terminates upon a determination that a probability of correctness 

3 of a modified translation is no greater than a probability of 

4 correctness of a previous translation. 

1 32. The decoder of claim 27 wherein the process loop

2 terminates upon completion of a predetermined number of 

3 iterations. 

1 33. The decoder of claim 27 wherein the one or more 

2 modification operators comprise one or more of the following: 

3 (i) an operator to change in the current target language 

4 translation the translation of one or two words;

5 (ii) an operator to change in the current target language

6 translation a translation of a word and to concurrently insert

7 another word at a position that yields an alignment of highest

8 probability between the source language text segment and the

9 current target language translation, the inserted other word

10 having a high probability of having a zero-value fertility;

11 (iii) an operator to delete from the current target language

12 translation a word having a zero-value fertility;

13 (iv) an operator to modify an alignment between the source

14 language text segment and the current target language translation

15 by swapping non-overlapping target language word segments in the

16 current target language translation; and

17 (v) an operator to modify an alignment between the source

18 language text segment and the current target language translation

19 by eliminating a target language word from the current target

20 language translation and linking words in the source language

21 text segment. 

1 34. A computer-implemented tree generation method

2 comprising: 

3 receiving as input a tree corresponding to a source language 

4 text segment; and 

5 applying one or more decision rules to the received input to 

6 generate a tree corresponding to a target language text segment.

1 35. The method of claim 34 wherein the one or more decision

2 rules comprise a sequence of decision rules. 

1 36. The method of claim 34 wherein the one or more decision 

2 rules collectively represent a transfer function. 

1 37. The method of claim 34 further comprising automatically 

2 determining the one or more decision rules based on a training 

3 set.

1 38. The method of claim 37 wherein the training set 

2 comprises a plurality of input-output tree pairs and a mapping

3 between each of the input-output tree pairs. 

1 39. The method of claim 38 wherein the mapping between each 

2 of the input-output tree pairs comprises a mapping between leaves

3 of the input tree and leaves of the paired output tree. 

1 40. The method of claim 39 wherein mappings between leaves 

2 of input-output tree pairs can be one-to-one, one-to-many,

3 many-to-one, or many-to-many.

1 41. The method of claim 38 wherein automatically 

2 determining the one or more decision rules comprises determining

3 a sequence of operations that generates an output tree when

4 applied to the paired input tree. 

1 42. The method of claim 41 wherein determining a sequence

2 of operations comprises using a plurality of predefined 

3 operations that collectively are sufficient to render any input 

4 tree into the input tree's paired output tree. 

1 43. The method of claim 42 wherein the plurality of 

2 predefined operations comprise one or more of the following: 

3 a shift operation that transfers an elementary discourse 

4 tree (edt) from an input list into a stack; 

5 a reduce operation that pops two edts from a top of the 

6 stack, combines the two popped edts into a new tree, and pushes 

7 the new tree on the top of the stack; 

8 a break operation that breaks an edt into a predetermined 

9 number of units; 

10 a create-next operation that creates a target language 

11 discourse constituent that has no correspondent in the source 

12 language tree; 

13 a fuse operation that fuses an edt at the top of the stack 

14 into the preceding edt; 

15 a swap operation that swaps positions of edts in the input 

16 list; and

17 an assignType operation that assigns one or more of the

18 following types to edts: Unit, MultiUnit, Sentence, Paragraph,

19 MultiParagraph, and Text.
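
Three of the operations named in claim 43 (shift, reduce, and swap) can be illustrated with a small stack-and-input-list state, sketched below in Python. The `Tree` and `ParserState` classes, the relation labels passed to `reduce`, and the `leaves` helper are hypothetical; the break, create-next, fuse, and assignType operations are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tree:
    label: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class ParserState:
    input_list: List[Tree]
    stack: List[Tree] = field(default_factory=list)

    def shift(self) -> None:
        """Transfer the first edt from the input list onto the stack."""
        self.stack.append(self.input_list.pop(0))

    def reduce(self, label: str) -> None:
        """Pop two trees from the top of the stack and push their combination."""
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(Tree(label, [left, right]))

    def swap(self, i: int, j: int) -> None:
        """Swap the positions of two edts in the input list."""
        self.input_list[i], self.input_list[j] = self.input_list[j], self.input_list[i]


def leaves(tree: Tree) -> List[str]:
    """List leaf labels from left to right."""
    if not tree.children:
        return [tree.label]
    return [leaf for child in tree.children for leaf in leaves(child)]


state = ParserState([Tree("edt1"), Tree("edt2"), Tree("edt3")])
state.swap(0, 1)       # reorder the first two units
state.shift()
state.shift()
state.reduce("span")   # hypothetical relation label
state.shift()
state.reduce("root")   # hypothetical relation label
```

After this sequence the stack holds a single tree whose leaves appear in the reordered sequence, which is the kind of input-to-output restructuring the operations are meant to express.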

1 44. The method of claim 43 wherein the plurality of 



2 predefined operations comprises a closed set including the shift 

3 operation, the reduce operation, the break operation, the

4 create-next operation, the fuse operation, the swap operation and the

5 assignType operation.

1 45. The method of claim 41 wherein determining a sequence

2 of operations results in a plurality of learning cases, one

3 learning case for each input-output tree pair.

1 46. The method of claim 45 further comprising associating

2 one or more features with each of the plurality of learning cases

3 based on context. 

1 47. The method of claim 46 wherein the associated features 

2 comprise one or more of the following: operational and discourse 

3 features, correspondence-based features, and lexical features. 

1 48. The method of claim 45 further comprising applying a 

2 learning program to the plurality of learning cases to generate 

3 the one or more decision rules.

1 49. The method of claim 48 wherein the learning program

2 comprises C4.5. 

1 50. The method of claim 34 wherein the source language text 

2 segment comprises a clause, a sentence, a paragraph, or a 

3 treatise. 

1 51. The method of claim 34 wherein the target language text 

2 segment comprises a clause, a sentence, a paragraph, or a 

3 treatise. 

1 52. The method of claim 34 wherein the source language text 

2 segment and the target language text segment are different types 

3 of text segments. 

1 53. The method of claim 34 wherein each of the source 

2 language tree and the target language tree comprises a syntactic 

3 tree. 

1 54. The method of claim 34 wherein each of the source 

2 language tree and the target language tree comprises a discourse 

3 tree. 

1 55. A computer-implemented tree generation module

2 comprising a predetermined set of decision rules that when

3 applied to a tree corresponding to a source language text segment

4 generate a tree corresponding to a target language text segment. 



1 56. The module of claim 55 wherein the source language text 

2 segment comprises a clause, a sentence, a paragraph, or a 

3 treatise. 

1 57. The module of claim 55 wherein the target language text 

2 segment comprises a clause, a sentence, a paragraph, or a 

3 treatise. 

1 58. The module of claim 55 wherein the source language text 

2 segment and the target language text segment are different types 

3 of text segments. 

1 59. The module of claim 55 wherein each of the source 

2 language tree and the target language tree comprises a syntactic 

3 tree. 

1 60. The module of claim 55 wherein each of the source 

2 language tree and the target language tree comprises a discourse 

3 tree.

1 61. The module of claim 55 wherein the predetermined set of

2 decision rules defines a transfer function between source 

3 language trees and target language trees. 

1 62. A method of determining a transfer function between

2 trees of different types, the method comprising: 

3 generating a training set comprising a plurality of tree 

4 pairs and a mapping between each tree pair, each tree pair 
5 comprising a source tree and a corresponding target tree;

6 generating a plurality of learning cases by determining, for

7 each tree pair, a sequence of operations that results in the

8 target tree when applied to the source tree; and

9 generating a plurality of decision rules by applying a

10 learning algorithm to the plurality of learning cases.
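
The last step of claim 62, turning learning cases into decision rules, might look like the following Python sketch. Note the substitution: instead of a decision-tree learner such as the C4.5 program named in claim 49, this example uses a simple majority-vote lookup table as a stand-in. The feature encoding (stack size, whether input remains), the sample cases, and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Tuple

Features = Tuple[Hashable, ...]
LearningCase = Tuple[Features, str]  # (context features, operation taken)


def learn_rules(cases: List[LearningCase]) -> Dict[Features, str]:
    """Stand-in learner: map each feature tuple to its most frequent operation."""
    votes: Dict[Features, Counter] = defaultdict(Counter)
    for features, operation in cases:
        votes[features][operation] += 1
    return {
        features: counter.most_common(1)[0][0]
        for features, counter in votes.items()
    }


def predict(rules: Dict[Features, str], features: Features, default: str) -> str:
    """Apply the learned rules, falling back to a default for unseen contexts."""
    return rules.get(features, default)


# Hypothetical learning cases: features are (stack size, input remaining?).
cases: List[LearningCase] = [
    ((0, True), "shift"),
    ((1, True), "shift"),
    ((2, True), "reduce"),
    ((2, True), "reduce"),
    ((2, True), "shift"),
    ((2, False), "reduce"),
]
rules = learn_rules(cases)
```

A real learner would generalize across feature values rather than memorize exact tuples; the lookup table is used here only because it keeps the example short and dependency-free.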

1 63. The method of claim 62 further comprising, prior to

2 generating the plurality of decision rules, associating one or 

3 more features with each of the learning cases based on context. 

1 64. A computer-implemented discourse-based machine

2 translation system comprising: 

3 a discourse parser that parses the discourse structure of a 

4 source language text segment and generates a source language 

5 discourse tree for the text segment;

6 a discourse-structure transfer module that accepts the

7 source language discourse tree as input and generates as output a 

8 target language discourse tree; and 

9 a mapping module that maps the target language discourse 
10 tree into a target text segment. 

1 65. The system of claim 64 wherein the discourse-structure

2 transfer module comprises a plurality of decision rules generated 
3 from a training set of source language-target language tree

4 pairs.
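
The three modules of claim 64 compose into a simple pipeline, sketched below in Python. The function `translate_via_discourse` and the toy parser, transfer, and mapping functions are hypothetical; in particular, a list of clauses stands in for a discourse tree, and clause reordering stands in for discourse-structure transfer.

```python
from typing import Callable, List, TypeVar

SourceTree = TypeVar("SourceTree")
TargetTree = TypeVar("TargetTree")


def translate_via_discourse(
    text: str,
    parse: Callable[[str], SourceTree],
    transfer: Callable[[SourceTree], TargetTree],
    realize: Callable[[TargetTree], str],
) -> str:
    """Run the parser, the transfer module, and the mapping module in order."""
    source_tree = parse(text)
    target_tree = transfer(source_tree)
    return realize(target_tree)


# Toy stand-ins: a flat list of clauses plays the role of a discourse tree.
def toy_parse(text: str) -> List[str]:
    return text.split(", ")


def toy_transfer(clauses: List[str]) -> List[str]:
    return list(reversed(clauses))


def toy_realize(clauses: List[str]) -> str:
    return ", ".join(clauses)


output = translate_via_discourse(
    "because it rained, we stayed home", toy_parse, toy_transfer, toy_realize
)
```

Keeping the three stages as separate callables mirrors the claim's separation of parser, transfer module, and mapping module, so each can be replaced independently.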
